General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Boosting
Authors
J. A. Aslam (MIT Laboratory for Computer Science) and Scott E. Decatur (Aiken Computation Laboratory, Harvard University)
Abstract
We derive general bounds on the complexity of learning in the Statistical Query (SQ) model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the Statistical Query model. This model was introduced by Kearns [12] to provide a general framework for efficient PAC learning in the presence of classification noise. We first show a general scheme for boosting the accuracy of weak SQ learning algorithms, proving that weak SQ learning is equivalent to strong SQ learning. The boosting is efficient, and it yields our main result: the first general upper bounds on the complexity of strong SQ learning. Specifically, we derive simultaneous upper bounds with respect to ε on the number of queries, O(log²(1/ε)); on the Vapnik-Chervonenkis dimension of the query space, O(log(1/ε) · log log(1/ε)); and on the inverse of the minimum tolerance, O((1/ε) · log(1/ε)). In addition, we show that these general upper bounds are nearly optimal by describing a class of learning problems for which we simultaneously lower bound the number of queries by Ω(log(1/ε)) and the inverse of the minimum tolerance by Ω(1/ε). We further apply our boosting results in the SQ model to learning in the PAC model with classification noise. Since nearly all PAC learning algorithms can be cast in the SQ model, we can apply our boosting techniques to convert these PAC algorithms into highly efficient SQ algorithms. By simulating these efficient SQ algorithms in the PAC model with classification noise, we show that nearly all PAC algorithms can be converted into highly efficient PAC algorithms which tolerate classification noise.

* Author was supported by DARPA Contract N00014-87-K825 and by NSF Grant CCR-89-14428. Author's net address: jaa@theory.lcs.mit.edu
† Author was supported by an NDSEG Fellowship and by NSF Grant CCR-92-00884. Author's net address: sed@das.harvard.edu
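To make the abstract's simulation step concrete, here is a minimal sketch of one standard way a single statistical query can be estimated from examples corrupted by random classification noise, assuming the noise rate eta is known and strictly below 1/2. This is an illustrative simplification, not the paper's construction; the target concept, distribution, and query below are hypothetical placeholders.

```python
import random

def noisy_examples(target, draw_x, eta, m):
    """Draw m examples (x, label), where the true label target(x) in {-1,+1}
    is flipped independently with probability eta (random classification noise)."""
    sample = []
    for _ in range(m):
        x = draw_x()
        label = target(x)
        if random.random() < eta:
            label = -label
        sample.append((x, label))
    return sample

def sq_estimate(chi, sample, eta):
    """Estimate P = Pr[chi(x, f(x)) = 1] from noisy examples.

    If P_hat = Pr[chi(x, noisy_label) = 1], then
        P_hat = (1 - eta) * P + eta * Pr[chi(x, -f(x)) = 1].
    Since Pr[chi(x, f(x)) = 1] + Pr[chi(x, -f(x)) = 1] equals the
    label-independent quantity S = Pr[chi(x, +1) = 1] + Pr[chi(x, -1) = 1],
    which can be estimated without trusting any label, we can solve:
        P = (P_hat - eta * S) / (1 - 2 * eta).
    """
    m = len(sample)
    p_hat = sum(chi(x, label) for x, label in sample) / m
    s = sum(chi(x, +1) + chi(x, -1) for x, _ in sample) / m
    return (p_hat - eta * s) / (1 - 2 * eta)

# Illustrative use: target is majority vote over 5 random bits, and the
# query asks for Pr[f(x) = +1 and the first bit of x is 1].
target = lambda x: 1 if sum(x) >= 3 else -1
draw_x = lambda: [random.randint(0, 1) for _ in range(5)]
chi = lambda x, label: int(label == 1 and x[0] == 1)

sample = noisy_examples(target, draw_x, eta=0.2, m=100_000)
print(sq_estimate(chi, sample, eta=0.2))
```

The point of the correction factor 1/(1 - 2·eta) is that the noisy estimate is a known affine distortion of the true query value, so it can be inverted; this is why the minimum tolerance bounded in the abstract governs the sample size needed under noise.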
Similar resources
Noise tolerant algorithms for learning and searching
We consider the problem of developing robust algorithms which cope with noisy data. In the Probably Approximately Correct model of machine learning, we develop a general technique which allows nearly all PAC learning algorithms to be converted into highly efficient PAC learning algorithms which tolerate noise. In the field of combinatorial algorithms, we develop techniques for constructing search a...
Statistical Query Learning (1993; Kearns)
The problem deals with learning {−1, +1}-valued functions from random labeled examples in the presence of random noise in the labels. In the random classification noise model of Angluin and Laird [1], the label of each example given to the learning algorithm is flipped randomly and independently with some fixed probability η called the noise rate. The model is an extension of Valiant's PAC m...
An Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...
Cost Complexity of Proactive Learning via a Reduction to Realizable Active Learning
Proactive Learning is a generalized form of active learning with multiple oracles exhibiting different reliabilities (label noise) and costs. We propose a general approach for Proactive Learning that explicitly addresses the cost vs. reliability tradeoff for oracle and instance selection. We formulate the problem in the PAC learning framework with bounded noise, and transform it into realizable...
Nearly Tight Bounds on $\ell_1$ Approximation of Self-Bounding Functions
We study the complexity of learning and approximation of self-bounding functions over the uniform distribution on the Boolean hypercube {0, 1}n. Informally, a function f : {0, 1}n → R is self-bounding if for every x ∈ {0, 1}n, f(x) upper bounds the sum of all the n marginal decreases in the value of the function at x. Self-bounding functions include such well-known classes of functions as submo...
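As a concrete reading of that informal definition, the sketch below brute-force checks whether a given f: {0,1}^n → R is self-bounding in the sense stated above, i.e. the sum of the n marginal decreases at each point is at most f(x). The function names and the example are illustrative, not from the cited paper.

```python
from itertools import product

def is_self_bounding(f, n):
    """Check that for every x in {0,1}^n, the sum over coordinates i of the
    marginal decrease max(0, f(x) - f(x with bit i flipped)) is at most f(x)."""
    for x in product((0, 1), repeat=n):
        total_decrease = 0.0
        for i in range(n):
            y = list(x)
            y[i] = 1 - y[i]
            total_decrease += max(0.0, f(x) - f(tuple(y)))
        if total_decrease > f(x):
            return False
    return True

# Example: f(x) = number of 1-bits is self-bounding, since exactly f(x)
# coordinate flips decrease f, each by 1, so the decreases sum to f(x).
print(is_self_bounding(lambda x: float(sum(x)), n=4))  # True
```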
Journal:
Inf. Comput.
Volume: 141, Issue: -
Pages: -
Publication year: 1993